Maximum likelihood inference of small trees in the presence of long branches.

Authors

  • Sarah L Parks
  • Nick Goldman
Abstract

The statistical basis of maximum likelihood (ML), its robustness, and the fact that it appears to suffer less from biases lead to it being one of the most popular methods for tree reconstruction. Despite its popularity, very few analytical solutions for ML exist, so biases suffered by ML are not well understood. One possible bias is long branch attraction (LBA), a regularly cited term generally used to describe a propensity for long branches to be joined together in estimated trees. Although initially mentioned in connection with inconsistency of parsimony, LBA has been claimed to affect all major phylogenetic reconstruction methods, including ML. Despite the widespread use of this term in the literature, exactly what LBA is and what may be causing it is poorly understood, even for simple evolutionary models and small model trees. Studies looking at LBA have focused on the effect of two long branches on tree reconstruction. However, to understand the effect of two long branches it is also important to understand the effect of just one long branch. If ML struggles to reconstruct one long branch, then this may have an impact on LBA. In this study, we look at the effect of one long branch on three-taxon tree reconstruction. We show that, counterintuitively, long branches are preferentially placed at the tips of the tree. This can be understood through the use of analytical solutions to the ML equation and distance matrix methods. We go on to look at the placement of two long branches on four-taxon trees, showing that there is no attraction between long branches, but that for extreme branch lengths long branches are joined together disproportionally often. These results illustrate that even small model trees are still interesting to help understand how ML phylogenetic reconstruction works, and that LBA is a complicated phenomenon that deserves further study.
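One way to see why long branches are hard for distance-based reasoning is to look at how pairwise distances saturate. The abstract does not say which substitution model was used, so the following is only a minimal sketch that assumes the Jukes-Cantor (JC69) model; the function names are illustrative and are not taken from the paper.

```python
import math


def jc69_expected_p(d):
    """Expected proportion of differing sites after d substitutions per site.

    Assumes the JC69 model (an assumption of this sketch, not stated in the abstract).
    """
    return 0.75 * (1.0 - math.exp(-4.0 * d / 3.0))


def jc69_distance(p):
    """JC69 distance estimate from an observed proportion p of differing sites."""
    if not 0.0 <= p < 0.75:
        # At p >= 0.75 the sequences look random and the estimate is undefined.
        raise ValueError("JC69 distance is undefined for p outside [0, 0.75)")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


if __name__ == "__main__":
    # As the true distance grows, the expected p approaches 0.75 from below.
    for d in (0.1, 0.5, 1.0, 2.0, 5.0):
        print(f"d = {d:4.1f}  expected p = {jc69_expected_p(d):.4f}")
```

Under this assumed model, the expected proportion of differing sites flattens out near 0.75 as the distance grows, so for long branches a small sampling fluctuation in the observed proportion translates into a large change in the estimated distance.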

Similar resources

Long Branch Effects Distort Maximum Likelihood Phylogenies in Simulations Despite Selection of the Correct Model

The aim of our study was to test the robustness and efficiency of maximum likelihood with respect to different long branch effects on multiple-taxon trees. We simulated data of different alignment lengths under two different 11-taxon trees and a broad range of different branch length conditions. The data were analyzed with the true model parameters as well as with estimated and incorrect assump...

Accurate Inference for the Mean of the Poisson-Exponential Distribution

Although the random sum distribution has been well-studied in probability theory, inference for the mean of such distribution is very limited in the literature. In this paper, two approaches are proposed to obtain inference for the mean of the Poisson-Exponential distribution. Both proposed approaches require the log-likelihood function of the Poisson-Exponential distribution, but the exact for...

Techniques for Assessing Phylogenetic Branch Support: A Performance Study

The inference of evolutionary relationships is usually aided by a reconstruction method which is expected to produce a reasonably accurate estimation of the true evolutionary history. However, various factors are known to impede the reconstruction process and result in inaccurate estimates of the true evolutionary relationships. Detecting and removing errors (wrong branches) from tree estimates...

Inference for the Type-II Generalized Logistic Distribution with Progressive Hybrid Censoring

This article presents the analysis of the Type-II hybrid progressively censored data when the lifetime distributions of the items follow Type-II generalized logistic distribution. Maximum likelihood estimators (MLEs) are investigated for estimating the location and scale parameters. It is observed that the MLEs can not be obtained in explicit forms. We provide the approximate maximum likelihood...

Inference on Pr(X > Y ) Based on Record Values From the Power Hazard Rate Distribution

In this article, we consider the problem of estimating the stress-strength reliability $Pr (X > Y)$ based on upper record values when $X$ and $Y$ are two independent but not identically distributed random variables from the power hazard rate distribution with common scale parameter $k$. When the parameter $k$ is known, the maximum likelihood estimator (MLE), the approximate Bayes estimator and ...

Journal:
  • Systematic Biology

Volume 63, Issue 5

Publication date: 2014